musical instrument
Scientists find musical link to boosting brain function for life
Learning to play a musical instrument can protect the brain from aging, building a defense against cognitive decline that lasts a lifetime. Researchers from Canada and China found that older adults who had spent years playing music were better at understanding speech in noisy environments, such as a crowded room, than those who didn't play. Their brains worked more like those of younger people, needing less energy to focus, whereas older non-musicians' brains had to work harder to compensate for age-related decline. Playing music was found to build up a person's 'cognitive reserve,' a kind of backup system in the brain. This reserve helps the brain stay efficient and function more like a younger brain, even as someone grows older.
- Asia > China (0.27)
- North America > Canada (0.25)
- Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.05)
Two Sonification Methods for the MindCube
Liu, Fangzheng, Blanchard, Lancelot, Haddad, Don D., Paradiso, Joseph A.
In this work, we explore the musical interface potential of the MindCube, an interactive device designed to study emotions. Embedding diverse sensors and input devices, this interface resembles a fidget cube toy commonly used to help users relieve their stress and anxiety. As such, it is a particularly well-suited controller for musical systems that aim to help with emotion regulation. In this regard, we present two different mappings for the MindCube, with and without AI. With our generative AI mapping, we propose a way to infuse meaning within a latent space and techniques to navigate through it with an external controller. We discuss our results and propose directions for future work.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.15)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Oceania > Australia > Australian Capital Territory > Canberra (0.05)
- (7 more...)
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
- Education (0.68)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.34)
SoundMorpher: Perceptually-Uniform Sound Morphing with Diffusion Model
Niu, Xinlei, Zhang, Jing, Martin, Charles Patrick
We present SoundMorpher, an open-world sound morphing method designed to generate perceptually uniform morphing trajectories. Traditional sound morphing techniques typically assume a linear relationship between the morphing factor and sound perception, achieving smooth transitions by linearly interpolating the semantic features of source and target sounds while gradually adjusting the morphing factor. However, these methods oversimplify the complexities of sound perception, resulting in limitations in morphing quality. In contrast, SoundMorpher explores an explicit relationship between the morphing factor and the perception of morphed sounds, leveraging log Mel-spectrogram features. This approach further refines the morphing sequence by ensuring a constant target perceptual difference for each transition and determining the corresponding morphing factors using binary search. To address the lack of a formal quantitative evaluation framework for sound morphing, we propose a set of metrics based on three established objective criteria. These metrics enable comprehensive assessment of morphed results and facilitate direct comparisons between methods, fostering advancements in sound morphing research. Extensive experiments demonstrate the effectiveness and versatility of SoundMorpher in real-world scenarios, showcasing its potential in applications such as creative music composition, film post-production, and interactive audio technologies. Our demonstration and codes are available at https://xinleiniu.github.io/SoundMorpher-demo/.
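The constant-perceptual-step idea in the abstract can be sketched in a few lines. In this toy version, `perceptual_distance` is a hypothetical stand-in for the log-Mel-spectrogram distance between the morph at factor `alpha` and the source sound (assumed monotonic in `alpha`); the binary search then finds the factor whose perceptual distance matches each equally spaced target step:

```python
import numpy as np

def perceptual_distance(alpha):
    # Toy stand-in for the log-Mel-spectrogram distance between the
    # morph at `alpha` and the source sound: monotonic but non-linear
    # in alpha, mimicking how perception deviates from linear mixing.
    return np.sqrt(alpha)

def find_alpha(target, lo=0.0, hi=1.0, tol=1e-6):
    """Binary-search the morphing factor whose perceptual distance
    from the source equals `target` (assumes monotonicity)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if perceptual_distance(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# A perceptually uniform trajectory in N steps: equal perceptual
# increments, not equal increments in the morphing factor itself.
N = 5
alphas = [find_alpha(k / N) for k in range(N + 1)]
```

Because the perceptual response is non-linear, the resulting `alphas` are unevenly spaced even though the perceptual increments between consecutive morphs are equal.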
- Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)
- North America > United States > Massachusetts > Plymouth County > Plymouth (0.04)
- Europe > Italy > Lombardy > Milan (0.04)
- Europe > Czechia > South Moravian Region > Brno (0.04)
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
Do Large Language Models have Problem-Solving Capability under Incomplete Information Scenarios?
Chen, Yuyan, Yu, Tianhao, Li, Yueze, Yan, Songzhou, Liu, Sijia, Liang, Jiaqing, Xiao, Yanghua
The evaluation of the problem-solving capability of Large Language Models (LLMs) under incomplete information scenarios is increasingly important, encompassing capabilities such as questioning, knowledge search, error detection, and path planning. Current research mainly focuses on LLMs' problem-solving capability in games such as "Twenty Questions". However, these games do not require recognizing misleading cues, which is necessary in incomplete information scenarios. Moreover, existing games such as "Who is undercover" are highly subjective, making them challenging to evaluate. Therefore, in this paper, we introduce a novel game named BrainKing, based on "Who is undercover" and "Twenty Questions", for evaluating LLM capabilities under incomplete information scenarios. It requires LLMs to identify target entities with limited yes-or-no questions and potential misleading answers. By setting up easy, medium, and hard difficulty modes, we comprehensively assess the performance of LLMs across various aspects. Our results reveal the capabilities and limitations of LLMs in BrainKing, providing significant insights into LLM problem-solving levels.
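A toy version of the setting illustrates why misleading answers make the task harder. This minimal sketch (the entities, attributes, and lie model are illustrative assumptions, not BrainKing's actual protocol) keeps every candidate whose attributes disagree with the observed answers in at most `max_lies` places:

```python
def solve_with_lies(entities, answers, max_lies=1):
    """Return the entities consistent with the yes/no answers,
    tolerating up to `max_lies` misleading answers (a toy model of
    the incomplete-information setting BrainKing evaluates)."""
    survivors = []
    for name, attrs in entities.items():
        mismatches = sum(1 for q, a in answers if attrs[q] != a)
        if mismatches <= max_lies:
            survivors.append(name)
    return survivors

entities = {
    "piano":  {"has_strings": True,  "is_portable": False, "is_electronic": False},
    "guitar": {"has_strings": True,  "is_portable": True,  "is_electronic": False},
    "drum":   {"has_strings": False, "is_portable": True,  "is_electronic": False},
}
# One misleading answer: the target is "guitar", but "is_portable" was answered False.
answers = [("has_strings", True), ("is_portable", False), ("is_electronic", False)]
print(solve_with_lies(entities, answers, max_lies=1))
# → ['piano', 'guitar']: a single lie keeps two candidates alive
```

Note that with `max_lies=0` the same answers would eliminate the true target entirely, which is exactly the failure mode a solver must guard against when answers can mislead.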
- Asia > China > Shanghai > Shanghai (0.04)
- North America > United States > New York (0.04)
- North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
- (2 more...)
- Media > Music (1.00)
- Leisure & Entertainment > Games (1.00)
A sound description: Exploring prompt templates and class descriptions to enhance zero-shot audio classification
Olvera, Michel, Stamatiadis, Paraskevas, Essid, Slim
Audio-text models trained via contrastive learning offer a practical approach to perform audio classification through natural language prompts, such as "this is a sound of" followed by category names. In this work, we explore alternative prompt templates for zero-shot audio classification, demonstrating the existence of higher-performing options. First, we find that the formatting of the prompts significantly affects performance so that simply prompting the models with properly formatted class labels performs competitively with optimized prompt templates and even prompt ensembling. Moreover, we look into complementing class labels by audio-centric descriptions. By leveraging large language models, we generate textual descriptions that prioritize acoustic features of sound events to disambiguate between classes, without extensive prompt engineering. We show that prompting with class descriptions leads to state-of-the-art results in zero-shot audio classification across major ambient sound datasets. Remarkably, this method requires no additional training and remains fully zero-shot.
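The prompt-plus-similarity recipe the abstract describes can be sketched abstractly. Here `embed_text` and `embed_audio` are toy stand-ins for a contrastive audio-text model's encoders (real systems such as CLAP-style models return learned embeddings); only the zero-shot classification logic — one text embedding per class prompt, cosine similarity against the audio embedding — follows the described approach:

```python
import numpy as np

# Hypothetical stand-ins for a contrastive audio-text model's encoders.
rng = np.random.default_rng(0)
_text_table = {}

def embed_text(prompt):
    # Deterministic toy text embedding per unique prompt string.
    return _text_table.setdefault(prompt, rng.normal(size=64))

def embed_audio(class_name):
    # Toy audio embedding: near its class prompt's text embedding,
    # mimicking what contrastive training aims for.
    return embed_text(f"this is a sound of {class_name}") + 0.1 * rng.normal(size=64)

def classify(audio_emb, classes, template="this is a sound of {}"):
    # Zero-shot classification: pick the class whose prompt embedding
    # has the highest cosine similarity with the audio embedding.
    sims = []
    for c in classes:
        t = embed_text(template.format(c))
        sims.append(audio_emb @ t / (np.linalg.norm(audio_emb) * np.linalg.norm(t)))
    return classes[int(np.argmax(sims))]

classes = ["dog bark", "rain", "siren"]
print(classify(embed_audio("rain"), classes))
# → "rain" (the toy audio embedding is built near its class prompt)
```

Swapping the `template` argument is the knob the paper studies: the same classifier with a better-formatted prompt, or with an LLM-generated acoustic description per class, changes accuracy without any retraining.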
- Media > Music (0.47)
- Leisure & Entertainment (0.47)
Generating Sample-Based Musical Instruments Using Neural Audio Codec Language Models
Nercessian, Shahan, Imort, Johannes, Devis, Ninon, Blang, Frederik
In this paper, we propose and investigate the use of neural audio codec language models for the automatic generation of sample-based musical instruments based on text or reference audio prompts. Our approach extends a generative audio framework to condition on pitch across an 88-key spectrum, velocity, and a combined text/audio embedding. We identify maintaining timbral consistency within the generated instruments as a major challenge. To tackle this issue, we introduce three distinct conditioning schemes. We analyze our methods through objective metrics and human listening tests, demonstrating that our approach can produce compelling musical instruments. Specifically, we introduce a new objective metric to evaluate the timbral consistency of the generated instruments and adapt the average Contrastive Language-Audio Pretraining (CLAP) score for the text-to-instrument case, noting that its naive application is unsuitable for assessing this task. Our findings reveal a complex interplay between timbral consistency, the quality of generated samples, and their correspondence to the input prompt.
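The conditioning described above (pitch across an 88-key spectrum, velocity, and a combined text/audio embedding) can be sketched as a single vector; the exact layout here is an illustrative assumption, not the paper's scheme:

```python
import numpy as np

def conditioning_vector(midi_pitch, velocity, text_audio_embedding):
    """Toy conditioning vector for a sample-based instrument generator:
    a one-hot pitch over the 88 piano keys (MIDI 21..108), a normalized
    velocity scalar, and a text/audio embedding, concatenated."""
    assert 21 <= midi_pitch <= 108, "pitch must lie on the 88-key range"
    pitch = np.zeros(88)
    pitch[midi_pitch - 21] = 1.0          # one-hot key position
    vel = np.array([velocity / 127.0])    # MIDI velocity scaled to [0, 1]
    return np.concatenate([pitch, vel, text_audio_embedding])

cond = conditioning_vector(60, 100, np.zeros(512))  # middle C, embedding dim 512
print(cond.shape)  # (601,)
```

A codec language model would then attend to this vector (or a learned projection of it) while generating the instrument's audio tokens, which is where the paper's three conditioning schemes for timbral consistency come in.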
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
SPINACH: SPARQL-Based Information Navigation for Challenging Real-World Questions
Liu, Shicheng, Semnani, Sina J., Triedman, Harold, Xu, Jialiang, Zhao, Isaac Dan, Lam, Monica S.
Recent work integrating Large Language Models (LLMs) has led to significant improvements in the Knowledge Base Question Answering (KBQA) task. However, we posit that existing KBQA datasets that either have simple questions, use synthetically generated logical forms, or are based on small knowledge base (KB) schemas, do not capture the true complexity of KBQA tasks. To address this, we introduce the SPINACH dataset, an expert-annotated KBQA dataset collected from forum discussions on Wikidata's "Request a Query" forum with 320 decontextualized question-SPARQL pairs. Much more complex than existing datasets, SPINACH calls for strong KBQA systems that do not rely on training data to learn the KB schema, but can dynamically explore large and often incomplete schemas and reason about them. Along with the dataset, we introduce the SPINACH agent, a new KBQA approach that mimics how a human expert would write SPARQLs for such challenging questions. Experiments on existing datasets show SPINACH's capability in KBQA, achieving a new state of the art on the QALD-7, QALD-9 Plus and QALD-10 datasets by 30.1%, 27.0%, and 10.0% in F1, respectively, and coming within 1.6% of the fine-tuned LLaMA SOTA model on WikiWebQuestions. On our new SPINACH dataset, SPINACH agent outperforms all baselines, including the best GPT-4-based KBQA agent, by 38.1% in F1.
- Europe > Austria > Vienna (0.14)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- North America > Dominican Republic (0.04)
- (23 more...)
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
- Education (1.00)
Music could be the secret to fighting off dementia, study says: 'Profound impact'
There's nothing like a nostalgic song to transport you back to a special time and place -- and now a new study has shown that music could help protect those memories for a lifetime. Researchers at the University of Exeter discovered that people who "engage in music" over the course of their lives tend to have improved memory and better overall brain health as they age, according to a press release. The findings were published in the International Journal of Geriatric Psychiatry.
- Health & Medicine > Consumer Health (1.00)
- Health & Medicine > Therapeutic Area > Neurology > Dementia (0.54)
On the Audio Hallucinations in Large Audio-Video Language Models
Nishimura, Taichi, Nakada, Shota, Kondo, Masayoshi
Large audio-video language models can generate descriptions for both video and audio. However, they sometimes ignore audio content, producing audio descriptions solely reliant on visual information. This paper refers to this as audio hallucinations and analyzes them in large audio-video language models. We gather 1,000 sentences by inquiring about audio information and annotate them whether they contain hallucinations. If a sentence is hallucinated, we also categorize the type of hallucination. The results reveal that 332 sentences are hallucinated with distinct trends observed in nouns and verbs for each hallucination type. Based on this, we tackle a task of audio hallucination classification using pre-trained audio-text models in the zero-shot and fine-tuning settings. Our experimental results reveal that the zero-shot models achieve higher performance (52.2% in F1) than the random (40.3%) and the fine-tuning models achieve 87.9%, outperforming the zero-shot models.
Mercedes Benz and will.i.am unveil futuristic technology that turns your car into a musical instrument
Nothing beats the experience of powering down the highway in your car with the speakers blaring out your favourite tunes. But often the music doesn't match up to the moments of the drive – whether it's the chorus kicking in when you hit the accelerator or steady beats breaking up the monotony of the motorway. Now, a solution has come from an unlikely source – will.i.am, the entrepreneur and musician best known as the founder of the Black Eyed Peas. He's partnered with German car maker Mercedes Benz on futuristic in-car software called Sound Drive that 'turns your car into a musical instrument'. When the driver accelerates, brakes or turns, the software reacts to create new sounds or remix existing tunes, making the driver 'the conductor' and the car 'the orchestra'.
- North America > United States > Nevada > Clark County > Las Vegas (0.05)
- Europe > United Kingdom (0.05)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)
- Transportation > Ground > Road (1.00)
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
- Automobiles & Trucks > Manufacturer (1.00)